 microscopic image


MMLNB: Multi-Modal Learning for Neuroblastoma Subtyping Classification Assisted with Textual Description Generation

Chen, Huangwei, Chen, Yifei, Yan, Zhenyu, Ding, Mingyang, Li, Chenlei, Zhu, Zhu, Qin, Feiwei

arXiv.org Artificial Intelligence

Neuroblastoma (NB), a leading cause of childhood cancer mortality, exhibits significant histopathological variability, necessitating precise subtyping for accurate prognosis and treatment. Traditional diagnostic methods rely on subjective evaluations that are time-consuming and inconsistent. To address these challenges, we introduce MMLNB, a multi-modal learning (MML) model that integrates pathological images with generated textual descriptions to improve classification accuracy and interpretability. The approach follows a two-stage process. First, we fine-tune a Vision-Language Model (VLM) to enhance pathology-aware text generation. Second, the fine-tuned VLM generates textual descriptions, and a dual-branch architecture independently extracts visual and textual features. These features are fused via a Progressive Robust Multi-Modal Fusion (PRMF) block for stable training. Experimental results show that MMLNB is more accurate than single-modal models. Ablation studies demonstrate the importance of multi-modal fusion, fine-tuning, and the PRMF mechanism. This research creates a scalable AI-driven framework for digital pathology, enhancing reliability and interpretability in NB subtyping classification. Our source code is available at https://github.com/HovChen/MMLNB.


AI-Driven Rapid Identification of Bacterial and Fungal Pathogens in Blood Smears of Septic Patients

Sroka-Oleksiak, Agnieszka, Pardyl, Adam, Rymarczyk, Dawid, Olechowska-Jarząb, Aldona, Biegun-Drożdż, Katarzyna, Ochońska, Dorota, Wronka, Michał, Borowa, Adriana, Gosiewski, Tomasz, Adamczyk, Miłosz, Telega, Henryk, Zieliński, Bartosz, Brzychczy-Włoch, Monika

arXiv.org Artificial Intelligence

Sepsis is a life-threatening condition which requires rapid diagnosis and treatment. Traditional microbiological methods are time-consuming and expensive. In response to these challenges, deep learning algorithms were developed to identify 14 bacteria species and 3 yeast-like fungi from microscopic images of Gram-stained smears of positive blood samples from sepsis patients. A total of 16,637 Gram-stained microscopic images were used in the study. The analysis used the Cellpose 3 model for segmentation and Attention-based Deep Multiple Instance Learning for classification. Our model achieved an accuracy of 77.15% for bacteria and 71.39% for fungi, with ROC AUC of 0.97 and 0.88, respectively. The highest values, reaching up to 96.2%, were obtained for Cutibacterium acnes, Enterococcus faecium, Stenotrophomonas maltophilia and Nakaseomyces glabratus. Classification difficulties were observed in closely related species, such as Staphylococcus hominis and Staphylococcus haemolyticus, due to morphological similarity, and within Candida albicans due to high morphotic diversity. The study confirms the potential of our model for microbial classification, but it also indicates the need for further optimisation and expansion of the training data set. In the future, this technology could support microbial diagnosis, reducing diagnostic time and improving the effectiveness of sepsis treatment due to its simplicity and accessibility. Part of the results presented in this publication was covered by a patent application at the European Patent Office EP24461637.1 "A computer implemented method for identifying a microorganism in a blood and a data processing system therefor".
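The attention-based multiple instance learning step mentioned above can be illustrated with a minimal sketch. It assumes the standard attention pooling formulation, in which each instance embedding receives a softmax-normalised score and the bag embedding is the weighted sum; the function name `attention_pool`, the parameters `V` and `w`, and the toy inputs are hypothetical and are not taken from the authors' implementation.

```python
import math

def attention_pool(instances, V, w):
    """Pool a bag of instance embeddings into one bag embedding.

    instances: list of embeddings (one per segmented cell crop, assumed)
    V: projection matrix (list of rows), w: scoring vector; both would be
    learned in practice and are fixed here for illustration only.
    """
    # Unnormalised score per instance: w . tanh(V h)
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))

    # Numerically stable softmax over the instances in the bag
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Bag embedding: attention-weighted sum of instance embeddings
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights

# Toy bag of three 2-D embeddings (hypothetical values)
bag, weights = attention_pool(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    V=[[1.0, 0.0], [0.0, 1.0]],
    w=[1.0, 1.0],
)
```

The returned weights indicate which instances contributed most to the bag-level decision, which is one reason this pooling is often chosen when per-instance labels are unavailable.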


Breaking Down the Hierarchy: A New Approach to Leukemia Classification

Hamdi, Ibraheem, El-Gendy, Hosam, Sharshar, Ahmed, Saeed, Mohamed, Ridzuan, Muhammad, Hashmi, Shahrukh K., Syed, Naveed, Mirza, Imran, Hussain, Shakir, Abdalla, Amira Mahmoud, Yaqub, Mohammad

arXiv.org Artificial Intelligence

The complexities inherent to leukemia, a multifaceted cancer affecting white blood cells, pose considerable diagnostic and treatment challenges, primarily due to reliance on laborious morphological analyses and expert judgment that are susceptible to errors. Addressing these challenges, this study presents a refined, comprehensive strategy leveraging advanced deep-learning techniques for the classification of leukemia subtypes. We commence by developing a hierarchical label taxonomy, paving the way for differentiating between various subtypes of leukemia. The research further introduces a novel hierarchical approach inspired by clinical procedures capable of accurately classifying diverse types of leukemia alongside reactive and healthy cells. An integral part of this study involves a meticulous examination of the performance of Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) as classifiers. The proposed method exhibits an impressive success rate, achieving approximately 90% accuracy across all leukemia subtypes, as substantiated by our experimental results. A visual representation of the experimental findings is provided to enhance the model's explainability and aid in understanding the classification process.
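One common way to use a hierarchical label taxonomy is to multiply conditional probabilities down the tree so that each leaf class receives a joint score. The sketch below illustrates only that general idea; the taxonomy, the probability values, and the function `leaf_probabilities` are hypothetical, and the abstract does not state that the authors combine predictions this way.

```python
def leaf_probabilities(taxonomy, cond_probs, node="root", prior=1.0):
    """Return {leaf: probability} by chaining P(child | parent) down the tree."""
    children = taxonomy.get(node, [])
    if not children:
        # A node without children is a leaf class
        return {node: prior}
    leaves = {}
    for child in children:
        leaves.update(
            leaf_probabilities(
                taxonomy, cond_probs, child, prior * cond_probs[node][child]
            )
        )
    return leaves

# Hypothetical two-level taxonomy (not the authors' label set)
taxonomy = {
    "root": ["healthy", "reactive", "leukemia"],
    "leukemia": ["acute", "chronic"],
}
# Hypothetical classifier outputs at each internal node
cond_probs = {
    "root": {"healthy": 0.1, "reactive": 0.2, "leukemia": 0.7},
    "leukemia": {"acute": 0.8, "chronic": 0.2},
}

scores = leaf_probabilities(taxonomy, cond_probs)
prediction = max(scores, key=scores.get)
```

Because each internal node's conditional probabilities sum to one, the leaf scores also sum to one and can be compared directly.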


Recognition of Harmful Phytoplankton from Microscopic Images using Deep Learning

Khaldi, Aymane, Khaldi, Rohaifa

arXiv.org Artificial Intelligence

Monitoring plankton distribution, particularly harmful phytoplankton, is vital for preserving aquatic ecosystems, regulating the global climate, and ensuring environmental protection. Traditional methods for monitoring are often time-consuming, expensive, error-prone, and unsuitable for large-scale applications, highlighting the need for accurate and efficient automated systems. In this study, we evaluate several state-of-the-art CNN models, including ResNet, ResNeXt, DenseNet, and EfficientNet, using three transfer learning approaches: linear probing, fine-tuning, and a combined approach, to classify eleven harmful phytoplankton genera from microscopic images. The best performance was achieved by ResNet-50 using the fine-tuning approach, with an accuracy of 96.97%. The results also revealed that the models struggled to differentiate between four harmful phytoplankton types with similar morphological features.


Parameter-Efficient Quantized Mixture-of-Experts Meets Vision-Language Instruction Tuning for Semiconductor Electron Micrograph Analysis

Srinivas, Sakhinana Sagar, Ravuru, Chidaksh, Sannidhi, Geethan, Runkana, Venkataramana

arXiv.org Artificial Intelligence

Semiconductors, crucial to modern electronics, are generally under-researched in foundational models. This highlights the need for research to enhance the semiconductor device technology portfolio and aid in high-end device fabrication. In this paper, we introduce sLAVA, a small-scale vision-language assistant tailored for semiconductor manufacturing, with a focus on electron microscopy image analysis. It addresses the challenges of data scarcity and of acquiring high-quality, expert-annotated data. We employ a teacher-student paradigm, using a foundational vision-language model such as GPT-4 as a teacher to create instruction-following multimodal data for customizing the student model, sLAVA, for electron microscopic image analysis tasks on consumer hardware with limited budgets. Our approach allows enterprises to further fine-tune the proposed framework with their proprietary data securely within their own infrastructure, protecting intellectual property. Rigorous experiments validate that our framework surpasses traditional methods, handles data shifts, and enables high-throughput screening.


Automated Quantification of White Blood Cells in Light Microscopic Images of Injured Skeletal Muscle

Jiao, Yang, Derakhshan, Hananeh, Schneider, Barbara St. Pierre, Regentova, Emma, Yang, Mei

arXiv.org Artificial Intelligence

White blood cells (WBCs) are the most diverse cell types observed in the healing process of injured skeletal muscles. In the course of healing, WBCs exhibit a dynamic cellular response and undergo multiple protein expression changes. The progress of healing can be analyzed by quantifying the number of WBCs or the amount of specific proteins in light microscopic images obtained at different time points after injury. In this paper, we propose an automated quantification and analysis framework to analyze WBCs using light microscopic images of uninjured and injured muscles. The proposed framework is based on the Localized Iterative (LI) Otsu's threshold method with muscle edge detection and region of interest extraction. Compared with the threshold methods used in ImageJ, the LI Otsu's threshold method is more robust to background areas and achieves better accuracy. The CD68-positive cell results are presented to demonstrate the effectiveness of the proposed work.
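For orientation, the classic global Otsu threshold that the framework builds on can be sketched as follows. This is plain Otsu on a flat list of 8-bit intensities, not the localized iterative variant described in the abstract, whose details are not given here; the function name `otsu_threshold` and the toy pixel values are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximises between-class variance.

    Pixels with intensity <= t form one class, the rest the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # intensity sum of the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

# Toy bimodal "image": dark background and bright stained cells (hypothetical)
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
foreground = sum(1 for p in pixels if p > t)
```

A localized variant would presumably apply a threshold of this kind within regions of interest rather than over the whole image, but that step is not reproduced here.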


Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption

Srinivas, Sakhinana Sagar, Ravuru, Chidaksh, Sannidhi, Geethan, Runkana, Venkataramana

arXiv.org Artificial Intelligence

Semiconductor imaging and analysis are critical yet understudied in deep learning, limiting our ability to precisely control and optimize semiconductor manufacturing. We introduce a small-scale multimodal framework for analyzing semiconductor electron microscopy images (MAEMI) through vision-language instruction tuning. We generate a customized instruction-following dataset using large multimodal models on microscopic image analysis. We perform knowledge transfer from larger to smaller models through knowledge distillation, resulting in improved accuracy of smaller models on visual question answering (VQA) tasks. This approach eliminates the need for expensive, human expert-annotated datasets for microscopic image analysis tasks. Enterprises can further fine-tune MAEMI on their proprietary data, enhancing privacy and performance on low-cost consumer hardware. Our experiments show that MAEMI outperforms traditional methods, adapts to data distribution shifts, and supports high-throughput screening.


SCKansformer: Fine-Grained Classification of Bone Marrow Cells via Kansformer Backbone and Hierarchical Attention Mechanisms

Chen, Yifei, Zhu, Zhu, Zhu, Shenghao, Qiu, Linwei, Zou, Binfeng, Jia, Fan, Zhu, Yunpeng, Zhang, Chenyan, Fang, Zhaojie, Qin, Feiwei, Fan, Jin, Wang, Changmiao, Gao, Yu, Yu, Gang

arXiv.org Artificial Intelligence

The incidence and mortality rates of malignant tumors, such as acute leukemia, have risen significantly. Clinically, hospitals rely on cytological examination of peripheral blood and bone marrow smears to diagnose malignant tumors, with accurate blood cell counting being crucial. Existing automated methods face challenges such as low feature expression capability, poor interpretability, and redundant feature extraction when processing high-dimensional microimage data. We propose a novel fine-grained classification model, SCKansformer, for bone marrow blood cells, which addresses these challenges and enhances classification accuracy and efficiency. The model integrates the Kansformer Encoder, SCConv Encoder, and Global-Local Attention Encoder. The Kansformer Encoder replaces the traditional MLP layer with the KAN, improving nonlinear feature representation and interpretability. The SCConv Encoder, with its Spatial and Channel Reconstruction Units, enhances feature representation and reduces redundancy. The Global-Local Attention Encoder combines Multi-head Self-Attention with a Local Part module to capture both global and local features. We validated our model using the Bone Marrow Blood Cell Fine-Grained Classification Dataset (BMCD-FGCD), comprising over 10,000 samples and nearly 40 classifications, developed with a partner hospital. Comparative experiments on our private dataset, as well as the publicly available PBC and ALL-IDB datasets, demonstrate that SCKansformer outperforms both typical and advanced microcell classification methods across all datasets. Our source code and private BMCD-FGCD dataset are available at https://github.com/JustlfC03/SCKansformer.


Geometric Data Augmentations to Mitigate Distribution Shifts in Pollen Classification from Microscopic Images

Cao, Nam, Saukh, Olga

arXiv.org Artificial Intelligence

Distribution shifts are characterized by differences between the training and test data distributions. They can significantly reduce the accuracy of machine learning models deployed in real-world scenarios. This paper explores the distribution shift problem when classifying pollen grains from microscopic images collected in the wild with a low-cost camera sensor. We leverage the domain knowledge that geometric features are highly important for accurate pollen identification and introduce two novel geometric image augmentation techniques to significantly narrow the accuracy gap between the model performance on the train and test datasets. In particular, we show that Tenengrad and ImageToSketch filters are highly effective at balancing shape and texture information while leaving out unimportant details that may confuse the model. Extensive evaluations on various model architectures demonstrate a consistent improvement in model generalization to field data of up to 14% achieved by the geometric augmentation techniques when compared to a wide range of standard image augmentations. The approach is validated through an ablation study using pollen hydration tests to recover the shape of dry pollen grains. The proposed geometric augmentations also receive the highest scores according to the affinity and diversity measures from the literature.
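Tenengrad is conventionally defined from Sobel gradients, so a Tenengrad-style filter can be sketched as a per-pixel squared gradient magnitude map that emphasises shape outlines over texture. The sketch below assumes that conventional definition; the function name `tenengrad_map`, the zeroed border handling, and the toy image are hypothetical, and the paper's exact augmentation pipeline may differ.

```python
def tenengrad_map(img):
    """Squared Sobel gradient magnitude per pixel for a 2-D grayscale image.

    img is a list of rows of floats; border pixels are left at 0.0
    (an assumption made here for simplicity).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal Sobel response: right column minus left column
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]) - (
                img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1]
            )
            # Vertical Sobel response: bottom row minus top row
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]) - (
                img[y - 1][x - 1] + 2 * img[y - 1][x] + img[y - 1][x + 1]
            )
            out[y][x] = gx * gx + gy * gy
    return out

# Toy image with a vertical edge between columns 1 and 2 (hypothetical)
edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
edge_map = tenengrad_map(edge)
flat_map = tenengrad_map([[0.5] * 4 for _ in range(4)])
```

Flat regions map to zero while edges produce large responses, which is consistent with the stated goal of keeping geometric information and discarding unimportant detail.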


Topologically-Regularized Multiple Instance Learning for Red Blood Cell Disease Classification

Kazeminia, Salome, Sadafi, Ario, Makhro, Asya, Bogdanova, Anna, Marr, Carsten, Rieck, Bastian

arXiv.org Artificial Intelligence

Diagnosing rare anemia disorders using microscopic images is challenging for skilled specialists and machine-learning methods alike. Due to thousands of disease-relevant cells in a single blood sample, this constitutes a complex multiple-instance learning (MIL) problem. While the spatial neighborhood of red blood cells is not meaningful per se, the topology, i.e., the geometry of blood samples as a whole, contains informative features to remedy typical MIL issues, such as vanishing gradients and overfitting when training on limited data. We thus develop a topology-based approach that extracts multi-scale topological features from bags of single red blood cell images. The topological features are used to regularize the model, enforcing the preservation of characteristic topological properties of the data. Applied to a dataset of 71 patients suffering from rare anemia disorders with 521 microscopic images of red blood cells, our experiments show that topological regularization is an effective method that leads to more than 3% performance improvements for the automated classification of rare anemia disorders based on single-cell images. This is the first approach that uses topological properties for regularizing the MIL process.